On Weak Base Hypotheses and Their Implications for Boosting Regression and Classification

Authors

  • WENXIN JIANG
Abstract

When studying the training error and the prediction error for boosting, it is often assumed that the hypotheses returned by the base learner are weakly accurate, or are able to beat a random guesser by a certain amount of difference. It has been an open question how much this difference can be, whether it will eventually disappear in the boosting process or be bounded by a positive amount. This question is crucial for the behavior of both the training error and the prediction error. In this paper we study this problem and show affirmatively that the amount of improvement over the random guesser will be at least a positive amount for almost all possible sample realizations and for most of the commonly used base hypotheses. This has a number of implications for the prediction error, including, for example, that boosting forever may not be good and regularization may be necessary. The problem is studied by first considering an analog of AdaBoost in regression, where we study similar properties and find that, for good performance, one cannot hope to avoid regularization by just adopting the boosting device to regression.
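To fix ideas, the assumption and its classical consequence can be written out. The following is a standard sketch in the usual AdaBoost notation (the round-t weights D_t, the per-round edge gamma_t, and the combined classifier H_T are conventional symbols assumed here, not notation taken from this paper):

    % Weak-learning condition: on each round t, the base hypothesis h_t
    % beats random guessing by an edge gamma_t >= gamma > 0 under the
    % current example weights D_t:
    \epsilon_t = \Pr_{i \sim D_t}\left[ h_t(x_i) \ne y_i \right]
      \le \frac{1}{2} - \gamma_t, \qquad \gamma_t \ge \gamma > 0.
    % Classical consequence (Freund and Schapire): the training error of
    % the combined classifier H_T decays exponentially in the number of
    % rounds T:
    \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{ H_T(x_i) \ne y_i \}
      \le \prod_{t=1}^{T} \sqrt{1 - 4\gamma_t^{2}}
      \le e^{-2\gamma^{2} T}.

Whether a uniform positive gamma of this kind can persist as boosting reweights the sample is precisely the open question the paper answers affirmatively; that persistence is what makes the behavior of "boosting forever" a genuine concern for the prediction error.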


Related articles

On Weak Base Learners for Boosting Regression and Classification

The most basic property of the boosting algorithm is its ability to reduce the training error, subject to the critical assumption that the base learners generate weak hypotheses that are better than random guessing. We exploit analogies between regression and classification to give a characterization of which base learners generate weak hypotheses, by introducing a geometric concept called the an...
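As a concrete illustration of this assumption, the sketch below runs plain AdaBoost with decision stumps and records each round's edge over random guessing. It is a minimal toy, assuming scikit-learn's DecisionTreeClassifier as a stand-in base learner; the function name and setup are illustrative, not taken from the paper:

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def adaboost_edges(X, y, n_rounds=50):
        """Plain AdaBoost with decision stumps; records each round's edge
        gamma_t = 1/2 - eps_t over random guessing. Labels y in {-1, +1}."""
        y = np.asarray(y)
        n = len(y)
        w = np.full(n, 1.0 / n)                   # D_t: current example weights
        edges = []
        for _ in range(n_rounds):
            stump = DecisionTreeClassifier(max_depth=1)
            stump.fit(X, y, sample_weight=w)
            pred = stump.predict(X)
            eps = float(np.sum(w * (pred != y)))  # weighted error eps_t
            edges.append(0.5 - eps)               # edge gamma_t
            if eps <= 0.0 or eps >= 0.5:          # perfect fit or no weak hypothesis
                break
            alpha = 0.5 * np.log((1.0 - eps) / eps)
            w = w * np.exp(-alpha * y * pred)     # upweight misclassified points
            w = w / w.sum()
        return edges

Whether the recorded edges stay bounded away from zero as the rounds proceed is exactly what a characterization of weak base learners must settle.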


On Weak Base Hypotheses and Their Implications

When studying the training error and the prediction error for boosting, it is often assumed that the hypotheses returned by the base learner are weakly accurate, or are able to beat a random guesser by a certain amount of difference. It has been an open question how much this difference can be, whether it will eventually disappear in the boosting process or be bounded by a finite amount; see...


Some Results on Weakly Accurate Base Learners for Boosting Regression and Classification

One basic property of the boosting algorithm is its ability to reduce the training error, subject to the critical assumption that the base learners generate `weak' (or, more appropriately, `weakly accurate') hypotheses that are better than random guessing. We exploit analogies between regression and classification to give a characterization of which base learners generate weak hypotheses, by introdu...


Robust Boosting via Convex Optimization: Theory and Applications

In this work we consider statistical learning problems. A learning machine aims to extract information from a set of training examples such that it is able to predict the associated label on unseen examples. We consider the case where the resulting classification or regression rule is a combination of simple rules – also called base hypotheses. The so-called boosting algorithms iteratively find...


Functional Frank-Wolfe Boosting for General Loss Functions

Boosting is a generic learning method for classification and regression. Yet, as the number of base hypotheses becomes larger, boosting can lead to a deterioration of test performance. Overfitting is an important and ubiquitous phenomenon, especially in regression settings. To avoid overfitting, we consider using l1 regularization. We propose a novel Frank-Wolfe type boosting algorithm (FWBoost...
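To make the role of l1 regularization concrete, here is a minimal Frank-Wolfe sketch for squared loss over a fixed, finite dictionary of base-hypothesis predictions. The matrix H (whose columns hold the base hypotheses' predictions on the training set), the radius tau, and the fixed dictionary are simplifying assumptions for illustration; this is not the paper's FWBoost algorithm:

    import numpy as np

    def fw_boost_l1(H, y, tau=5.0, n_iters=100):
        """Frank-Wolfe for min_c 0.5*||H c - y||^2 subject to ||c||_1 <= tau.
        c holds the combination weights over the base hypotheses."""
        n, m = H.shape
        c = np.zeros(m)
        for k in range(n_iters):
            grad = H.T @ (H @ c - y)          # gradient of the squared loss
            j = int(np.argmax(np.abs(grad)))  # best base hypothesis this round
            s = np.zeros(m)
            s[j] = -tau * np.sign(grad[j])    # vertex of the l1 ball
            step = 2.0 / (k + 2.0)            # standard Frank-Wolfe step size
            c = (1.0 - step) * c + step * s   # move toward the chosen vertex
        return c

Keeping ||c||_1 <= tau is what caps the size of the combined predictor: no matter how many rounds are run, the ensemble cannot grow without bound, which is the regularization effect the abstract appeals to.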



Journal:

Volume   Issue

Pages  -

Publication date: 1999